Morphological Skip-Gram: Replacing FastText characters n-gram with morphological knowledge
Similar resources
Incremental Skip-gram Model with Negative Sampling
This paper explores an incremental training strategy for the skip-gram model with negative sampling (SGNS) from both empirical and theoretical perspectives. Existing methods of neural word embeddings, including SGNS, are multi-pass algorithms and thus cannot perform incremental model update. To address this problem, we present a simple incremental extension of SGNS and provide a thorough theore...
Skip-Gram - Zipf + Uniform = Vector Additivity
In recent years, word-embedding models have gained great popularity due to their remarkable performance on several tasks, including word analogy questions and caption generation. An unexpected “side-effect” of such models is that their vectors often exhibit compositionality, i.e., adding two word-vectors results in a vector that is only a small angle away from the vector of a word representing th...
Riemannian Optimization for Skip-Gram Negative Sampling
The Skip-Gram Negative Sampling (SGNS) word embedding model, well known by its implementation in the “word2vec” software, is usually optimized by stochastic gradient descent. However, the optimization of the SGNS objective can be viewed as a problem of searching for a good matrix with the low-rank constraint. The most standard way to solve this type of problem is to apply Riemannian optimization framework...
Exploring phrase-compositionality in skip-gram models
In this paper, we introduce a variation of the skip-gram model which jointly learns distributed word vector representations and their way of composing to form phrase embeddings. In particular, we propose a learning procedure that incorporates a phrase-compositionality function which can capture how we want to compose phrase vectors from their component word vectors. Our experiments show improve...
A Closer Look at Skip-gram Modelling
Data sparsity is a large problem in natural language processing: language is a system of rare events, so varied and complex that even with an extremely large corpus we can never accurately model all possible strings of words. This paper examines the use of skip-grams (a technique whereby n-grams are still stored to model language, but they allow for tokens to be ...
Journal
Journal title: Inteligencia Artificial
Year: 2021
ISSN: 1988-3064,1137-3601
DOI: 10.4114/intartif.vol24iss67pp1-17